AITopics | euphemism detection

Impromptu Cybercrime Euphemism Detection

Li, Xiang, Zhou, Yucheng, Zhao, Laiping, Li, Jing, Liu, Fangming

arXiv.org Artificial Intelligence, Dec-3-2024

Detecting euphemisms is essential for content security on social media platforms, but existing euphemism detection methods are ineffective on impromptu euphemisms. In this work, we make a first attempt at exploring impromptu euphemism detection and introduce the Impromptu Cybercrime Euphemisms Detection (ICED) dataset. Moreover, we propose a detection framework tailored to this problem, which employs context augmentation modeling and multi-round iterative training. Our detection framework mainly consists of a coarse-grained and a fine-grained classification model. The coarse-grained classification model removes most of the harmless content in the corpus to be detected. The fine-grained model, the impromptu euphemism detector, integrates context augmentation and multi-round iterative training to better predict the actual meaning of a masked token. In addition, we leverage ChatGPT to evaluate the model's capability. Experimental results demonstrate that our approach achieves a remarkable 76-fold improvement compared to the previous state-of-the-art euphemism detector.
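
The coarse-then-fine flow described in this abstract can be sketched as a two-stage filter. This is only an illustration of the control flow, not the authors' implementation: the paper's stages are trained classifiers, whereas `coarse_is_suspicious`, `fine_detect`, `SUSPICIOUS_CUES`, and `TOY_LEXICON` below are hypothetical stand-ins built from made-up word lists.

```python
from typing import Callable, Iterable

# Hypothetical toy resources; the real system uses trained models instead.
SUSPICIOUS_CUES = {"sell", "ship", "price"}   # assumed cue words (illustration only)
TOY_LEXICON = {"snow": "cocaine"}             # assumed euphemism -> meaning map


def coarse_is_suspicious(text: str) -> bool:
    """Coarse stage: cheaply decide whether a text deserves a closer look."""
    return any(tok in SUSPICIOUS_CUES for tok in text.lower().split())


def fine_detect(text: str) -> list[tuple[str, str]]:
    """Fine stage: return (token, guessed meaning) pairs for one text."""
    return [(tok, TOY_LEXICON[tok]) for tok in text.lower().split()
            if tok in TOY_LEXICON]


def detect(
    corpus: Iterable[str],
    coarse: Callable[[str], bool] = coarse_is_suspicious,
    fine: Callable[[str], list[tuple[str, str]]] = fine_detect,
) -> dict[str, list[tuple[str, str]]]:
    """Run the fine detector only on texts that survive the coarse filter."""
    results = {}
    for text in corpus:
        if not coarse(text):
            continue  # most harmless content is dropped here
        hits = fine(text)
        if hits:
            results[text] = hits
    return results
```

The point of the split is cost: the expensive fine-grained model only sees the small share of texts that the cheap coarse stage lets through.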

cybercrime euphemism, dataset, euphemism, (12 more...)

arXiv: 2412.01413

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Texas (0.04)
(13 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.46)

Turkish Delights: a Dataset on Turkish Euphemisms

Biyik, Hasan Can, Lee, Patrick, Feldman, Anna

arXiv.org Artificial Intelligence, Jul-17-2024

Euphemisms are a form of figurative language relatively understudied in natural language processing. This research extends the current computational work on potentially euphemistic terms (PETs) to Turkish. We introduce the Turkish PET dataset, the first available of its kind in the field. By creating a list of euphemisms in Turkish, collecting example contexts, and annotating them, we provide both euphemistic and non-euphemistic examples of PETs in Turkish. We describe the dataset and methodologies, and also experiment with transformer-based models on Turkish euphemism detection by using our dataset for binary classification. We compare performances across models using F1, accuracy, and precision as evaluation metrics.

dataset, euphemism, euphemism detection, (14 more...)

arXiv: 2407.1304

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)
North America > United States > New Jersey (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)

MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms

Lee, Patrick, Trujillo, Alain Chirino, Plancarte, Diana Cuevas, Ojo, Olumide Ebenezer, Liu, Xinyi, Shode, Iyanuoluwa, Zhao, Yuan, Peng, Jing, Feldman, Anna

arXiv.org Artificial Intelligence, Jan-25-2024

This study investigates the computational processing of euphemisms, a universal linguistic phenomenon, across multiple languages. We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. In line with current trends, we demonstrate that zero-shot learning across languages takes place. We also show cases where multilingual models perform better on the task compared to monolingual models by a statistically significant margin, indicating that multilingual data presents additional opportunities for models to learn about cross-lingual, computational properties of euphemisms. In a follow-up analysis, we focus on universal euphemistic "categories" such as death and bodily functions among others. We test to see whether cross-lingual data of the same domain is more important than within-language data of other domains to further understand the nature of the cross-lingual transfer.
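
The cross-lingual zero-shot setting mentioned in this abstract can be illustrated with a leave-one-language-out split, where the test language never appears in training. This is a sketch of the evaluation setup only, under the assumption that each example is a dict with a `lang` key. The function name `zero_shot_splits` and the record layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Iterator


def zero_shot_splits(examples: list[dict]) -> Iterator[tuple[str, list[dict], list[dict]]]:
    """Yield (held_out_language, train, test) triples.

    For each language, the training set contains every example from the
    other languages and the test set contains only the held-out language.
    """
    by_lang = defaultdict(list)
    for ex in examples:
        by_lang[ex["lang"]].append(ex)

    for held_out in sorted(by_lang):
        train = [ex for lang, items in by_lang.items()
                 if lang != held_out
                 for ex in items]
        yield held_out, train, by_lang[held_out]
```

A multilingual model such as XLM-RoBERTa would then be fine-tuned on `train` and scored on `test` for each triple. A non-trivial score on the held-out language is what indicates zero-shot transfer.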

dataset, euphemism, experiment, (11 more...)

arXiv: 2401.14526

Country:

North America > Canada > Ontario > Toronto (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)
North America > United States > New Jersey (0.04)
(2 more...)

Genre: Research Report > New Finding (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

TEDB System Description to a Shared Task on Euphemism Detection 2022

Wiriyathammabhum, Peratham

arXiv.org Artificial Intelligence, Jan-16-2023

In this report, we describe our Transformers for euphemism detection baseline (TEDB) submissions to a shared task on euphemism detection 2022. We cast the task of predicting euphemism as text classification. We considered Transformer-based models, which are the current state-of-the-art methods for text classification. We explored different training schemes, pretrained models, and model architectures. Our best result of 0.816 F1-score (0.818 precision and 0.814 recall) comes from a euphemism-detection-finetuned TweetEval/TimeLMs-pretrained RoBERTa model used as a feature extractor frontend with a KimCNN classifier backend, trained end-to-end using a cosine annealing scheduler. We observed that models pretrained on sentiment analysis and offensiveness detection correlate with higher F1-scores, while pretraining on other tasks, such as sarcasm detection, produces lower F1-scores. Also, adding more word vector channels does not improve performance in our experiments.
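
As a quick consistency check on the reported figures, F1 is the harmonic mean of precision and recall. The sketch below assumes all three numbers refer to the same class and averaging scheme, which the abstract does not state. The helper name `f1_score` is chosen here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract above.
f1 = f1_score(0.818, 0.814)
print(round(f1, 3))  # 0.816
```

Under that assumption, the quoted precision and recall reproduce the quoted F1-score to three decimal places.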

machine learning, natural language, text classification, (16 more...)

arXiv: 2301.06602

Country:

Europe > United Kingdom (0.04)
Europe > Spain (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
(5 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

A Report on the Euphemisms Detection Shared Task

Lee, Patrick, Feldman, Anna, Peng, Jing

arXiv.org Artificial Intelligence, Dec-3-2022

This paper presents the Shared Task on Euphemism Detection for the Third Workshop on Figurative Language Processing (FigLang 2022), held in conjunction with EMNLP 2022. Participants were invited to investigate the euphemism detection task: given input text, identify whether it contains a euphemism. The input data is a corpus of sentences containing potentially euphemistic terms (PETs), collected from the GloWbE corpus (Davies and Fuchs, 2015) and human-annotated as containing either a euphemistic or a literal usage of a PET. In this paper, we present the results and analyze the common themes, methods, and findings of the participating teams.

large language model, machine learning, natural language, (15 more...)

arXiv: 2211.13327

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > New Jersey (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)

Exploring Euphemism Detection in Few-Shot and Zero-Shot Settings

Keh, Sedrick Scott

arXiv.org Artificial Intelligence, Oct-23-2022

Euphemisms are figures of speech which aim to soften the blow of certain words which may be too direct or too harsh (Magu and Luo, 2018; Felt and Riloff, 2020). In the EMNLP 2022 FigLang Workshop Euphemism Shared Task, participating teams are given a set of sentences with potentially euphemistic terms (PETs) enclosed in brackets, and the task is to classify whether or not the PET in a given sentence is used euphemistically.

Compared to other figures of speech like similes (Chakrabarty et al., 2020) and metaphors (Chakrabarty et al., 2021), work on euphemisms has been limited. Recently, Gavidia et al. (2022); Lee et al. (2022) released a new dataset of diverse euphemisms and conducted analysis on automatically identifying potentially euphemistic terms. In the past, Felt and Riloff (2020) used sentiment analysis techniques to recognize euphemistic and dysphemistic phrases. Other studies also focused on …
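
The input format mentioned in this excerpt, sentences whose PETs are enclosed in brackets, can be handled with a small parsing step before classification. The sketch below assumes square brackets as the delimiter. The excerpt does not say which bracket style the shared-task data uses, so `PET_PATTERN` may need adjusting. The example sentence in the usage note is made up for illustration.

```python
import re

# Assumption: PETs are wrapped in square brackets, e.g. "[passed away]".
PET_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_pets(sentence: str) -> list[str]:
    """Return every bracketed term in the sentence, without the brackets."""
    return [term.strip() for term in PET_PATTERN.findall(sentence)]


def strip_markers(sentence: str) -> str:
    """Remove the bracket markers but keep the enclosed term in place."""
    return PET_PATTERN.sub(r"\1", sentence)
```

For instance, `extract_pets("Her grandfather [passed away] last spring.")` yields the marked term, and `strip_markers` on the same input gives the plain sentence that a classifier without marker tokens would see.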

euphemism, large language model, natural language, (17 more...)

arXiv: 2210.12926

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

CATs are Fuzzy PETs: A Corpus and Analysis of Potentially Euphemistic Terms

Gavidia, Martha, Lee, Patrick, Feldman, Anna, Peng, Jing

arXiv.org Artificial Intelligence, May-5-2022

Euphemisms have not received much attention in natural language processing, despite being an important element of polite and figurative language. Euphemisms prove to be a difficult topic, not only because they are subject to language change, but also because humans may not agree on what is a euphemism and what is not. Nevertheless, the first step to tackling the issue is to collect and analyze examples of euphemisms. We present a corpus of potentially euphemistic terms (PETs) along with example texts from the GloWbE corpus. Additionally, we present a subcorpus of texts where these PETs are not being used euphemistically, which may be useful for future applications. We also discuss the results of multiple analyses run on the corpus. Firstly, we find that sentiment analysis on the euphemistic texts supports that PETs generally decrease negative and offensive sentiment. Secondly, we observe cases of disagreement in an annotation task, where humans are asked to label PETs as euphemistic or not in a subset of our corpus text examples. We attribute the disagreement to a variety of potential reasons, including if the PET was a commonly accepted term (CAT).

disagreement, euphemism, interpretation, (13 more...)

arXiv: 2205.02728

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.34)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.34)
